The modeling of cascade processes in multi-agent systems
in the form of complex networks has in recent years become 
an important topic of study due to its many applications: 
the adoption of commercial products, spread of disease, the 
diffusion of an idea, etc. In this paper, we begin by identifying
seven desiderata that a framework for
modeling such processes should satisfy: the ability to rep- 
resent attributes of both nodes and edges, an explicit rep- 
resentation of time, the ability to represent non-Markovian 
temporal relationships, representation of uncertain informa- 
tion, the ability to represent competing cascades, allowance 
of non-monotonic diffusion, and computational tractability. 
We then present the MANCaLog language, a formalism based 
on logic programming that satisfies all these desiderata, and 
focus on algorithms for finding minimal models (from which 
the outcome of cascades can be obtained) as well as how 
this formalism can be applied in real world scenarios. We 
are not aware of any other formalism in the literature that 
meets all of the above requirements. 

Categories and Subject Descriptors 

I.2.4 [Artificial Intelligence]: Knowledge Representation
Formalisms and Methods — Representation Languages 

General Terms 

Languages, Algorithms 

Keywords 

1. INTRODUCTION AND RELATED WORK

An epidemic working through a population, cascading electrical
power failures, product adoption, and the spread of a mutant gene
are all examples of diffusion processes that can happen in
multi-agent systems structured as complex networks. These network
processes have been studied in a variety of disciplines, including
computer science [10], biology [11], sociology [8], economics [12],
and physics [15]. Much existing work in this area is based on
pre-existing models in sociology and economics - in particular the
work of [8, 12]. However, recent examinations of social networks -
both analyses of large data sets and experiments - have indicated
that there may be additional factors to consider that are not taken
into account by these models. These include the attributes of nodes
and edges, competing diffusion processes, and time. In this paper,
we outline seven design criteria (Section 1.1) for such a framework
and introduce MANCaLog (Section 2), which is to the best of our
knowledge the first logical language for modeling diffusion in
complex networks that meets these criteria. MANCaLog is a rule-based
framework (inspired by logic programming) that can richly express
how agents adopt or fail to adopt certain behaviors, and how these
behaviors cascade through a network. We also introduce fixed-point
based algorithms that allow for the calculation of the result of
the diffusion process in Section 3. Note that these algorithms are
proven not only to be correct, but also to run in polynomial time.
Hence, our approach can not only better express many aspects of
cascades in complex networks, but it can do so in a reasonable
amount of time. We conclude by discussing applications of MANCaLog
in Section 4.

This is an extended version of MANCaLog: A Logic for
Multi-Attribute Network Cascades, which ap-

Proofs of all results stated in this paper can be found in the 
appendix. 

1.1 Desiderata of Properties 

We begin by identifying a set of criteria that we believe a 
framework for reasoning about cascades in complex networks 
should satisfy. 

1. Multiply labeled and weighted nodes and edges. 

Many existing frameworks for studying diffusion in complex 
networks assume that there is only one type of vertex that 
may become "active" [10] or may "mutate" [15] and only
one possible relationship between nodes. In reality, nodes 
and edges often have different properties. For instance, la- 
bels on edges can be used to differentiate between strong 
and weak ties (edge types) - a concept that is well studied [7].
Recently, such attributes of nodes have been shown
to impact influence in a network H^. 

2. Explicit Representation of Time. Most work in
the literature assumes static models, with the exception
of the recent developments in [4, 5, 6], which assume the
existence of a timestamped log referring to actions taken
in the network in order to learn how nodes influence each
other. Though [2] tackles the problem of predicting the 
time at which a certain node will take an action, the au- 
thors make several simplifying assumptions such as mono- 
tonicity of probability functions, probabilistic independence, 
sub-modularity and, most importantly for this criterion, a 
modeling of time solely based on temporal decay of influ- 
ence. We seek a richer model of temporal relationships be- 
tween conditions in the network structure, the current state 
of the cascades in process, and how influence propagates. 

3. Non-Markovian Temporal Relationships. Apart 
from time being explicitly represented, the temporal depen- 
dencies should be able to span multiple units of time. Hence, 
the "memoryless" mode of a standard Markov process, where 
only the information of the current state is required, is insuf- 
ficient. Here, we strive to create a framework where depen- 
dencies can be from other earlier time steps. This issue has 
been previously studied with respect to more general logic 
programming frameworks such as [13], but to our knowledge
has not been applied to social networks.

4. Representation of Uncertainty. As in practice it is 
not always possible to judge the attributes of all individuals 
in a network, an element of uncertainty must be included. 
However, in connection with point 7, this should not be at 
the expense of tractability. For instance, the probabilistic 
models of [10] are normally addressed with simulation (and
hence do not scale well) as the computation of the expected
number of activated nodes is a #P-hard problem [5].

5. Competing Cascades. Often, in real-world situations 
there will be competing cascading processes. For example, 
in evolutionary graph theory [11], "mutants" and "residents"
compete for nodes in the network - the success of one hinges 
on the failure of the other. 

6. Non-Monotonic Cascades. In much existing work on 
cascades in complex networks, the number of nodes attain- 
ing a certain property at each time step can only increase. 
However, if we allow for competing cascades in the same 
model, we cannot have such a strong restriction as the suc- 
cess of one cascade may come at the expense of another. 

7. Tractability. The social networks of interest in today's 
data mining problems often have millions of nodes. It is 
reasonable to expect that soon billion-node networks will be 
commonplace. Any framework for dealing with these prob- 
lems must be solvable in a reasonable amount of time and 
offer areas for practical improvement for further scalability. 

1.2 Related Work 

The above criteria can be summarized as the desire to 
design the most expressive language for network cascades 
possible while still allowing computation of the outcome of 
a diffusion process to be completed in a tractable amount 
of time. As a comparison, let us briefly describe some relevant
related work. Perhaps the best known general model for representing
diffusion in complex networks is the independent cascade/linear
threshold (IC/LT) model of [10]. However, although this framework
was shown to be capable of expressing a wide variety of sociological
models, it assumes the Markov property and does not allow for the
representation of multiple attributes on vertices and edges. A more
recent framework, social network optimization problems (SNOPs) [14],
uses logic programming to allow for the representation of attributes,
but this framework does not allow for competing processes or
non-monotonic cascades. A related logic programming framework,
competitive diffusion (CD) [2], allows for competitive diffusion and
non-monotonic processes but does not explicitly represent time and
also makes Markovian assumptions. Further, we also note that the
semantics of CD yields a "most probable interpretation" that is not
a unique solution. Hence, a given model in that framework can lead
to multiple, and possibly contradictory, outcomes to a cascade (this
problem is avoided in MANCaLog). Another popular class of models is
Evolutionary Graph Theory (EGT) [11], which is highly related to the
voter model (VM) [15]. Although this framework allows for competing
processes and non-monotonic diffusion, it also makes Markovian
assumptions while not explicitly representing time. Further,
determining the outcome of a cascade in those models is NP-hard,
while determining the outcome in MANCaLog can be accomplished in
polynomial time. Table 1 lists how these models compare to MANCaLog
when considering our design criteria.

Figure 1: Simple online social network $G_{soc}$. Solid
edges are labeled with strTie, while dashed edges are
labeled with wkTie. White nodes are labeled with
male, while gray nodes are labeled with fem. Arrows
represent the direction of the edge; double-headed
edges represent two edges with the same label.

2. FRAMEWORK 

2.1 Syntax and Semantics 

In this paper we assume that agents are arranged in a
directed graph (or network) $G = (V, E)$, where the set of
nodes corresponds to the agents, and the edges model the
relationships between them. We also assume a set of labels
$\mathcal{L}$, which is partitioned into two sets: fluent labels
$\mathcal{L}_f$ (labels that can change over time) and non-fluent
labels $\mathcal{L}_{nf}$ (labels that do not); labels can be
applied to both the nodes and edges of the network. We will use
the notation $\mathcal{G} = V \cup E$ to be the set of all
components (nodes and edges) in the network. Thus,
$c \in \mathcal{G}$ could be either a node or an edge.

Example 2.1. We will use the sample online social network
$G_{soc}$ shown in Figure 1 as the running example;
$\mathcal{G}_{soc}$ is used to denote the set of components of
$G_{soc}$. Here we have
$\mathcal{L}_{nf} = \{male, fem, strTie, wkTie\}$, representing male,
female, strong ties, and weak ties, respectively. Additionally, we
have $\mathcal{L}_f = \{visPgA, visPgB\}$, representing visiting
webpage A and visiting webpage B, respectively. ■

In this paper, we present a logical language where we use 
atoms, referring to labels and weights, to describe properties 
of the nodes and edges. Though labels themselves could 
be modeled as atoms instead of predicates (to model non- 
ground labelings that allow for greater expressibility) , for 
simplicity of presentation we leave this to future work. The 
first piece of the syntax is the network atom. 



Table 1: Comparison with other models

Criterion                                         MANCaLog  IC/LT [10]  SNOP [14]  CD     EGT/VM [11]
1. Multiply labeled and weighted nodes and edges  Yes       No          Yes        Yes    No
2. Explicit Representation of Time                Yes       No          Yes        No     Yes
3. Non-Markovian Temporal Relationships           Yes       No          No         No     No
4. Uncertainty                                    Yes       Yes         Yes        Yes    Yes
5. Competing Cascades                             Yes       No          No         Yes    Yes
6. Non-monotonic Cascades                         Yes       No          No         Yes    Yes
7. Tractability                                   PTIME     #P-hard     PTIME      PTIME  NP-hard

Definition 2.1 (Network Atom). Given label $L \in \mathcal{L}$
and weight interval $bnd \subseteq [0,1]$, then
$\langle L, bnd \rangle$ is a network atom. A network atom is
fluent (resp., non-fluent) if $L \in \mathcal{L}_f$ (resp.,
$L \in \mathcal{L}_{nf}$). We use $NA$ to denote the set of
all possible network atoms.

Network atoms describe properties of nodes and edges. 
The definition is intuitive: L represents a property of the 
vertex or edge, and associated with this property is some 
weight that may have associated uncertainty - hence repre- 
sented as an interval bnd, which can be open or closed. An 
invalid bound is represented by $\emptyset$, which is equivalent to all
other invalid bounds. 

Definition 2.2 (World). A world $W$ is a set of network
atoms such that for each $L \in \mathcal{L}$ there is no more than
one network atom of the form $\langle L, bnd \rangle$ in $W$.

A network formula over $NA$ is defined using conjunction,
disjunction, and negation in the usual way. If a formula
contains only non-fluent (resp., fluent) atoms, it is a
non-fluent (resp., fluent) formula.

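Definition 2.3 (Satisfaction of Worlds). Given a
world $W$ and network formula $f$, satisfaction of $f$ by $W$
(denoted $W \models f$) is defined:

• If $f = \langle L, [0,1] \rangle$ then $W \models f$.

• If $f = \langle L, \emptyset \rangle$ then $W \not\models f$.

• If $f = \langle L, bnd \rangle$, with $bnd \neq \emptyset$ and
$bnd \neq [0,1]$, then $W \models f$ iff there exists
$\langle L, bnd' \rangle \in W$ s.t. $bnd' \subseteq bnd$.

• If $f = \neg f'$ then $W \models f$ iff $W \not\models f'$.

• If $f = f_1 \wedge f_2$ then $W \models f$ iff $W \models f_1$
and $W \models f_2$.

• If $f = f_1 \vee f_2$ then $W \models f$ iff $W \models f_1$
or $W \models f_2$.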

For some arbitrary label $L \in \mathcal{L}$, we will use the
notation $Tr = \langle L, [0,1] \rangle$ and
$F = \langle L, \emptyset \rangle$ to represent a tautology
and contradiction, respectively. For ease of notation (and
without loss of generality), we say that if there does not
exist some $bnd$ s.t. $\langle L, bnd \rangle \in W$, then this
implies that $\langle L, [0,1] \rangle \in W$.

Example 2.2. Following from Example 2.1, the network
atom $\langle fem, [1,1] \rangle$ can be used to identify a node as a
woman. Likewise, the world

$W = \{\langle fem, [1,1] \rangle, \langle male, [0,0] \rangle, \langle visPgA, [1,1] \rangle, \langle visPgB, [0,0] \rangle\}$

might be used to identify a woman who visits webpage A.
Clearly, we have that

$W \models \langle fem, [1,1] \rangle \wedge \neg\langle visPgA, [0.5, 0.9] \rangle \wedge \neg\langle visPgB, [0.1, 0.7] \rangle$

Note that the network atoms formed with strTie and wkTie
are not present; this could be due to the fact that such a
world is used to describe a node and not an edge, and hence
there is no information about those two labels. As such is
the case, $W \models \langle strTie, [0,1] \rangle \wedge \langle wkTie, [0,1] \rangle$. ■
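
The satisfaction relation of Definition 2.3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the formalism: intervals are taken to be closed and are written as (lo, hi) tuples, None stands in for the invalid bound, formulas are encoded as nested tuples, and the names subset and satisfies are hypothetical.

```python
from typing import Optional, Tuple

# Assumption: closed intervals only; None stands for the invalid (empty) bound.
Bound = Optional[Tuple[float, float]]
FULL = (0.0, 1.0)


def subset(inner: Bound, outer: Bound) -> bool:
    """Return True if interval `inner` is contained in interval `outer`."""
    if inner is None:  # the empty set is contained in every interval
        return True
    if outer is None:
        return False
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def satisfies(world: dict, formula: tuple) -> bool:
    """Check W |= f, following the cases of Definition 2.3."""
    kind = formula[0]
    if kind == "atom":
        _, label, bnd = formula
        if bnd is None:      # <L, invalid bound> is never satisfied
            return False
        if bnd == FULL:      # <L, [0,1]> is always satisfied
            return True
        # A label missing from the world is read as <L, [0,1]>.
        return subset(world.get(label, FULL), bnd)
    if kind == "not":
        return not satisfies(world, formula[1])
    if kind == "and":
        return satisfies(world, formula[1]) and satisfies(world, formula[2])
    if kind == "or":
        return satisfies(world, formula[1]) or satisfies(world, formula[2])
    raise ValueError(f"unknown formula kind: {kind}")


# The world and formula of Example 2.2.
W = {"fem": (1.0, 1.0), "male": (0.0, 0.0),
     "visPgA": (1.0, 1.0), "visPgB": (0.0, 0.0)}
f = ("and", ("atom", "fem", (1.0, 1.0)),
     ("and", ("not", ("atom", "visPgA", (0.5, 0.9))),
             ("not", ("atom", "visPgB", (0.1, 0.7)))))
print(satisfies(W, f))  # True
```

Labels absent from the dictionary (here strTie and wkTie) are treated as carrying the bound [0,1], mirroring the convention stated before Example 2.2.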

The idea is to use MANCaLog to describe how properties 
(specified by labels) of the nodes in the network change over 
time. We assume that there is some natural number $t_{max}$
that specifies the total amount of time we are considering,
and we use $\tau = \{t \mid t \in [0, t_{max}]\}$ to denote the set of all
time points. How well a certain property can be attributed 
to a node is based on a weight (to which the bnd bound in 
the network atom refers). As time progresses, a weight can 
either increase/decrease and/or become more/less certain. 
We now introduce the MANCaLog fact, which states that 
some network atom is true for a node or edge during certain 
times. 

Definition 2.4 (MANCaLog Fact). If $[t_1, t_2] \subseteq [0, t_{max}]$,
$c \in \mathcal{G}$, and $a \in NA$, then $(a, c) : [t_1, t_2]$ is a
MANCaLog fact. A fact is fluent (resp., non-fluent) if atom $a$ is
fluent (resp., non-fluent). All non-fluent facts must be of the form
$(a, c) : [0, t_{max}]$. Let $\mathcal{F}$ be the set of all facts and
$\mathcal{F}_{nf}$, $\mathcal{F}_f$ be the set of all non-fluent and
fluent facts, respectively.

Example 2.3. Following from the previous examples, the following
facts are based on Figure 1:

$F_1 = (\langle male, [1,1] \rangle, 1) : [0, t_{max}]$

$F_2 = (\langle fem, [1,1] \rangle, 2) : [0, t_{max}]$

$F_3 = (\langle male, [1,1] \rangle, 3) : [0, t_{max}]$

$F_4 = (\langle strTie, [1,1] \rangle, (1,2)) : [0, t_{max}]$

$F_5 = (\langle strTie, [1,1] \rangle, (2,1)) : [0, t_{max}]$

$F_6 = (\langle wkTie, [1,1] \rangle, (2,3)) : [0, t_{max}]$

$F_7 = (\langle visPgA, [0.8, 1.0] \rangle, 1) : [0, t_{max}]$

$F_8 = (\langle visPgA, [0.5, 1.0] \rangle, 2) : [0, t_{max}]$

For instance, agent 1 is male, and has a strong tie to agent 2,
who is female. ■

Next, we introduce integrity constraints (ICs). 

Definition 2.5. Given fluent network atom $a$ and conjunction
of network atoms $b$, an integrity constraint is of the
form $a \hookleftarrow b$.

Intuitively, integrity constraint $\langle L, bnd \rangle \hookleftarrow b$
means that if at a certain time point a component (vertex or edge) of
the network has a set of properties specified by conjunction
$b$, then at that same time the component's weight for label
$L$ must be in interval $bnd$.

Example 2.4. Following from the previous examples, the
integrity constraint
$\langle male, [0,0] \rangle \hookleftarrow \langle fem, [1,1] \rangle$
would require any node designated as a female to not be male. ■

We now define MANCaLog rules. The idea behind rules is 
simple: an agent that meets some criteria is influenced by 
the set of its neighbors who possess certain properties. The 
amount of influence exerted on an agent by its neighbors is 
specified by an influence function, whose precise effects will 
be described later on when we discuss the semantics. As a 
result, a rule consists of four major parts: (i) an influence 
function, (ii) neighbor criteria, (iii) target criteria, and (iv) a 
target. Intuitively, (i) specifies how the neighbors influence
the agent in question, (ii) specifies which of the neighbors
can influence the agent, (iii) specifies the criteria that cause
the agent to be influenced, and (iv) is the property of the
agent that changes as a result of the influence.

We will discuss each of these parts in turn, and then define
rules in terms of these elements. First, we define influence
functions and neighbor criteria.

Definition 2.6 (Influence Function). An influence
function is a function
$ifl : \mathbb{N} \times \mathbb{N} \to [0,1] \times [0,1]$ that
satisfies the following two axioms:

1. $ifl$ can be computed in constant ($O(1)$) time.

2. For $x' > x$ we have $ifl(x', y) \subseteq ifl(x, y)$.

We use $IFL$ to denote the set of all influence functions.

Intuitively, an influence function takes the number of qualifying
influencers and the number of eligible influencers and
returns a bound on the new value for the weight of the prop- 
erty of the target node that changes. In practice, we expect 
the time complexity of such a function to be a polynomial 
in terms of the two arguments. However, as both arguments 
are naturals bounded by the maximum degree of a node in 
the network, this value will be much smaller than the size 
of the network - we thus treat it as a constant here. 

Example 2.5. The well-known "tipping model" originally 
introduced in Q \10j states that an agent adopts a behavior 
when a certain fraction of his incoming neighbors do so. A 
common tipping function is the majority threshold where 
at least half of the agent's neighbors must previously adopt 
the behavior. We can represent this using the following in- 
fluence function: 



$$tip(x, y) = \begin{cases} [1.0, 1.0] & \text{if } x/y \geq 0.5 \\ [0.0, 1.0] & \text{otherwise} \end{cases}$$

This function says that an agent adopts a certain behavior
if at least half of his incoming neighbors have some property
(strong ties, weak ties, meet some requirement of gender,
income, etc.) and that we have no information otherwise.
In our framework, we can leverage the bounds associated with
the influence function to create a "soft" tipping function:

$$sftTp(x, y) = \begin{cases} [0.7, 1.0] & \text{if } x/y \geq 0.5 \\ [0.0, 1.0] & \text{otherwise} \end{cases}$$

Intuitively, the above function says that an agent adopts a
behavior with a weight of at least 0.7 if at least half of the
incoming neighbors that have some attribute meet some criteria,
and we have no information otherwise. Another possibility
is to have an influence function that may reduce the weight
with which an agent adopts a certain behavior:

$$ngTp(x, y) = \begin{cases} [0.0, 0.2] & \text{if } x = y \\ [0.0, 1.0] & \text{otherwise} \end{cases}$$

The ngTp function says that an agent will adopt a behavior
with a weight no greater than 0.2 if all of the incoming
neighbors possessing some property meet some criteria, and
that we have no information otherwise. ■
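
The three influence functions of Example 2.5 translate directly into code. The sketch below is illustrative only: the Python names tip, sft_tp, ng_tp, and is_monotone are hypothetical, bounds are written as (lo, hi) tuples, and the treatment of y = 0 (no eligible neighbors) in the two tipping functions is an assumption, since the example does not address that case. The helper checks axiom 2 of Definition 2.6 only on the range 0 <= x < x' <= y, which is a further assumption reflecting that qualifying neighbors are drawn from the eligible ones.

```python
def tip(x: int, y: int):
    """Majority threshold: the weight is fixed to 1 when at least half qualify."""
    # Assumption: with no eligible neighbors (y == 0) nothing is learned.
    if y > 0 and x / y >= 0.5:
        return (1.0, 1.0)
    return (0.0, 1.0)


def sft_tp(x: int, y: int):
    """Soft tipping: weight of at least 0.7 when at least half qualify."""
    if y > 0 and x / y >= 0.5:
        return (0.7, 1.0)
    return (0.0, 1.0)


def ng_tp(x: int, y: int):
    """Negative tipping: weight of at most 0.2 when all eligible neighbors qualify."""
    if x == y:
        return (0.0, 0.2)
    return (0.0, 1.0)


def is_monotone(ifl, max_degree: int) -> bool:
    """Check ifl(x', y) within ifl(x, y) for all 0 <= x < x' <= y <= max_degree."""
    for y in range(max_degree + 1):
        for x in range(y + 1):
            for x2 in range(x + 1, y + 1):
                lo, hi = ifl(x, y)
                lo2, hi2 = ifl(x2, y)
                if not (lo <= lo2 and hi2 <= hi):
                    return False
    return True


print(tip(1, 2), sft_tp(2, 3), ng_tp(3, 3))
print(all(is_monotone(f, 6) for f in (tip, sft_tp, ng_tp)))
```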

Definition 2.7 (Neighbor Criterion). If $g_{edge}$, $g_{node}$
are non-fluent network formulas (formed over edges and nodes,
respectively), $h$ is a conjunction of network atoms, and $ifl$ is
an influence function, then $(g_{edge}, g_{node}, h)_{ifl}$ is a
neighbor criterion.

Formulas $g_{node}$ and $h$ in a neighbor criterion specify the
(non-fluent and fluent, respectively) criteria on a given neighbor,
while formula $g_{edge}$ specifies the non-fluent criteria on
the directed edge from that neighbor to the node in question.

The next component is the "target criteria", which are the
criteria that an agent must satisfy in order to be influenced
by its neighbors. Ideas such as "susceptibility" [1] can be
integrated into our framework via this component. We represent
these criteria with a formula of non-fluent network
atoms. The final component, the "target", is simply the label
of the target agent that is influenced by its neighbors.
Hence, we now have all the pieces to define a rule.

Definition 2.8 (Rule). Given fluent label $L$, natural
number $\Delta t$, target criteria $f$, and neighbor criteria
$(g_{edge}, g_{node}, h)_{ifl}$, a MANCaLog rule is of the form:

$r = L \xleftarrow{\Delta t} f, (g_{edge}, g_{node}, h)_{ifl}$

We will use the notation $head(r)$ to denote $L$.

Note that the target (also referred to as the head) of the rule 
is a single label; essentially, the body of the rule characterizes 
a set of nodes, and this label is the one that is modified for 
each node in this set. More specifically, the rule is essentially 
saying that when certain conditions for an agent and its 
neighbors are met, the bnd bound for the network atom 
formed with label L on that agent changes. Later, in the 
semantics, we introduce network interpretations, which map 
components (nodes and edges) of the network to worlds at 
a given point in time. The rule dictates how this mapping 
changes in the next time step. 

Definition 2.9 (MANCaLog Program). A program $P$
is a set of rules, facts, and integrity constraints s.t. each
non-fluent fact $F \in \mathcal{F}_{nf}$ appears no more than once
in the program. Let $\mathcal{P}$ be the set of all programs.

Example 2.6. Following from the previous examples, we
can have a MANCaLog program that leverages the sftTp and
ngTp influence functions in rules that are more expressive
than previous models. Consider the following rules:

$R_1 = visPgA \xleftarrow{2} \langle fem, [1,1] \rangle, (\langle strTie, [0.9, 1] \rangle, Tr, \langle visPgA, [0.9, 1.0] \rangle)_{sftTp}$

$R_2 = visPgB \xleftarrow{1} \langle male, [1,1] \rangle, (Tr, Tr, \langle visPgB, [0.8, 1.0] \rangle)_{sftTp}$

$R_3 = visPgA \xleftarrow{3} \langle male, [1,1] \rangle, (Tr, \langle fem, [1,1] \rangle, \neg\langle visPgA, [0.7, 1.0] \rangle)_{ngTp}$

Rule $R_1$ says that a female agent in the network visits page
A with a weight of at least 0.7 (this is specified in the sftTp
influence function) if at least half of her strong ties (with
weight of at least 0.9) visited the page (with a weight of at
least 0.9) two days ago. The rest of the rules can be read
analogously. ■

We now introduce our first semantic structure: the network
interpretation.

Definition 2.10 (Network Interpretation). A network
interpretation is a mapping of network components to
sets of network atoms, $NI : \mathcal{G} \to 2^{NA}$. We will use
$\mathcal{NI}$ to denote the set of all network interpretations.

We note that not all labels will necessarily apply to all
nodes and edges in the network. For instance, certain labels
may describe a relationship while others may only describe
a property of an individual in the network. If a given label $L$
does not describe a certain component $c$ of the network, then
in a valid network interpretation $\langle L, [0,1] \rangle \in NI(c)$.

Example 2.7. Consider $G'_{soc}$, the induced subgraph of $G_{soc}$
that has only nodes {1, 2, 3, 4, 5}. Table 2 shows the contents
of $NI_1$, an example network interpretation. ■

Table 2: Example network interpretation, $NI_1$.

Component  male    fem     strTie  wkTie   visPgA      visPgB
1          [1,1]   [0,0]                   [0.9, 1.0]  [0.8, 1.0]
2          [0,0]   [1,1]                   [0.0, 0.3]  [0.0, 0.2]
3          [1,1]   [0,0]                   [0.6, 1.0]  [0.0, 0.2]
4          [0,0]   [1,1]                   [0.0, 0.2]  [?, 1.0]
5          [1,1]   [0,0]                   [0.0, 0.2]  [0.7, 1.0]
(1,2)                      [1,1]   [0,0]
(2,1)                      [1,1]   [0,0]
(1,3)                      [0,0]   [1,1]
(2,3)                      [0,0]   [1,1]
(3,4)                      [1,1]   [0,0]
(4,3)                      [1,1]   [0,0]
(4,5)                      [1,1]   [0,0]

We define a MANCaLog interpretation as follows.

Definition 2.11 (Interpretation). A MANCaLog interpretation
$I$ is a mapping of natural numbers in the interval
$[0, t_{max}]$ to network interpretations, i.e.,
$I : \mathbb{N} \to \mathcal{NI}$. Let $\mathcal{I}$ be the set of
all possible interpretations.

2.2 Satisfaction

First, we define what it means for an interpretation to
satisfy a fact and a rule.

Definition 2.12 (Fact Satisfaction). An interpretation
$I$ satisfies MANCaLog fact $(a, c) : [t_1, t_2]$, written
$I \models (a, c) : [t_1, t_2]$, iff $\forall t \in [t_1, t_2]$,
$I(t)(c) \models a$.

Example 2.8. Consider interpretation $I_1$, where $I_1(0) = NI_1$
(from Example 2.7), and MANCaLog facts $F_7$ and $F_8$
from Example 2.3. In this case, $I_1 \models F_7$ and
$I_1 \not\models F_8$. ■

For non-fluent facts, we introduce the notion of strict
satisfaction, which enforces the bound in the interpretation to
be set to exactly what the fact dictates.

Definition 2.13 (Strict Fact Satisfaction). Interpretation
$I$ strictly satisfies MANCaLog fact $(a, c) : [t_1, t_2]$
iff $\forall t \in [t_1, t_2]$, $a \in I(t)(c)$.

Next, we define what it means for an interpretation to
satisfy an integrity constraint.



Definition 2.14 (IC Satisfaction). An interpretation
$I$ satisfies integrity constraint $a \hookleftarrow b$ iff for all
$t \in \tau$ and $c \in \mathcal{G}$, $I(t)(c) \models \neg b \vee a$.

Before we define what it means for an interpretation to 
satisfy a rule, we require two auxiliary definitions that are 
used to define the bound enforced on a label by a given rule, 
and the set of time points that are affected by a rule. 

Definition 2.15 (Bound function). For a given rule
$r = L \xleftarrow{\Delta t} f, (g_{edge}, g_{node}, h)_{ifl}$, node $v$,
and network interpretation $NI$,

$Bound(r, v, NI) = ifl(|Qual(v, g_{edge}, g_{node}, h, NI)|, |Elig(v, g_{edge}, g_{node}, NI)|)$,

where

$Elig(v, g_{edge}, g_{node}, NI) = \{v' \in V \mid NI(v') \models g_{node} \wedge (v', v) \in E \wedge NI((v', v)) \models g_{edge}\}$

and

$Qual(v, g_{edge}, g_{node}, h, NI) = \{v' \in Elig(v, g_{edge}, g_{node}, NI) \mid NI(v') \models h\}$.

Intuitively, the bound returned by the function depends on
the influence function and the number of qualifying and
eligible nodes that influence it.
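
The computation in Definition 2.15 can be sketched as follows. This is a simplified illustration rather than a faithful implementation: the formulas $g_{edge}$, $g_{node}$, and $h$ are passed as plain Python predicates over worlds, a network interpretation is a dictionary from components to worlds, and the three-neighbor network and all identifiers (within, elig, qual, bound, sft_tp) are hypothetical.

```python
def within(world: dict, label: str, lo: float, hi: float) -> bool:
    """True if the world's bound for `label` lies inside [lo, hi]."""
    b_lo, b_hi = world.get(label, (0.0, 1.0))
    return lo <= b_lo and b_hi <= hi


def elig(v, g_edge, g_node, ni, edges):
    """Incoming neighbors of v whose node and connecting edge meet the criteria."""
    return {u for (u, w) in edges
            if w == v and g_node(ni[u]) and g_edge(ni[(u, w)])}


def qual(v, g_edge, g_node, h, ni, edges):
    """Eligible neighbors that additionally satisfy the fluent criteria h."""
    return {u for u in elig(v, g_edge, g_node, ni, edges) if h(ni[u])}


def bound(ifl, v, g_edge, g_node, h, ni, edges):
    """Apply the influence function to the qualifying and eligible counts."""
    return ifl(len(qual(v, g_edge, g_node, h, ni, edges)),
               len(elig(v, g_edge, g_node, ni, edges)))


def sft_tp(x, y):
    # Soft tipping function from Example 2.5 (y == 0 handling is an assumption).
    return (0.7, 1.0) if y > 0 and x / y >= 0.5 else (0.0, 1.0)


# Hypothetical network: three neighbors with edges pointing at node "t".
edges = {("a", "t"), ("b", "t"), ("c", "t")}
ni = {
    "a": {"visPgB": (0.8, 1.0)},
    "b": {"visPgB": (0.0, 0.2)},
    "c": {"visPgB": (0.9, 1.0)},
    "t": {},
    ("a", "t"): {}, ("b", "t"): {}, ("c", "t"): {},
}


def always(world):   # plays the role of Tr
    return True


def visits_b(world):  # plays the role of <visPgB, [0.8, 1.0]>
    return within(world, "visPgB", 0.8, 1.0)


# Two of the three eligible neighbors qualify, so the soft tipping bound applies.
print(bound(sft_tp, "t", always, always, visits_b, ni, edges))  # (0.7, 1.0)
```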

Definition 2.16 (Target Time Set). For interpretation
$I$, node $v$, and rule
$r = L \xleftarrow{\Delta t} f, (g_{edge}, g_{node}, h)_{ifl}$, the
target time set of $I, v, r$ is defined as follows:

$TTS(I, v, r) = \{t \in [0, t_{max}] \mid I(t - \Delta t)(v) \models f\}$

We also extend this definition to a program $P$, for a given
$c \in \mathcal{G}$ and $L \in \mathcal{L}$, as follows:

$TTS(I, c, L, P) = \bigcup_{r \in P,\ head(r) = L} TTS(I, c, r) \;\cup\; \{t \in [t_1, t_2] \mid (\langle L, bnd \rangle, c) : [t_1, t_2] \in P\} \;\cup\; \{t \mid \langle L, bnd \rangle \hookleftarrow b \in P \wedge I(t)(c) \models b\}$

We can now define satisfaction of a rule by an interpretation.

Definition 2.17. An interpretation $I$ satisfies a rule
$r = L \xleftarrow{\Delta t} f, (g_{edge}, g_{node}, h)_{ifl}$ iff for
all $v \in V$ and $t \in TTS(I, v, r)$ it holds that

$I(t)(v) \models \langle L, Bound(r, v, I(t - \Delta t)) \rangle$.



Example 2.9. Let $I_1$ be the interpretation from Example 2.8.
Suppose that $\langle visPgB, [0.8, 1.0] \rangle \in I_1(1)(5)$. In this
case, $I_1 \models R_2$. Let $I_2$ be equivalent to $I_1$ except that
we have $\langle visPgB, [0.0, 0.5] \rangle \in I_2(1)(3)$. In this
case, $I_2 \not\models R_2$. ■



We now define satisfaction of programs, and introduce
canonical interpretations, in which time points that are not
"targets" retain information from the last time step.

Definition 2.18. For interpretation $I$ and program $P$:

$I$ is a model for $P$ iff it satisfies all rules, integrity
constraints, and fluent facts in that program, strictly satisfies
all non-fluent facts in the program, and for all
$L \in \mathcal{L}$, $c \in \mathcal{G}$, and
$t \notin TTS(I, c, L, P)$, $\langle L, [0,1] \rangle \in I(t)(c)$.

$I$ is a canonical model for $P$ iff it satisfies all rules,
integrity constraints, and fluent facts in $P$, strictly satisfies
all non-fluent facts in $P$, and for all $L \in \mathcal{L}$,
$c \in \mathcal{G}$, and $t \notin TTS(I, c, L, P)$,
$\langle L, [0,1] \rangle \in I(t)(c)$ when $t = 0$ and
$\langle L, bnd \rangle \in I(t)(c)$ where
$\langle L, bnd \rangle \in I(t - 1)(c)$, otherwise.

Example 2.10. Following from previous examples, if we
consider interpretation $I_1$ and program $P = \{F_7, R_2\}$, we
have that $\langle visPgB, [0.0, 0.2] \rangle$ must be in
$I_1(1)(2)$ in order for $I_1$ to be canonical. ■

2.3 Consistency and Entailment 

In this section we discuss consistency and entailment in
MANCaLog programs, and explore the use of minimal mod- 
els towards computing answers to these problems. 

Definition 2.19 (Consistency). A MANCaLog pro- 
gram P is (canonically) consistent iff there exists a (canon- 
ical) model I of P. 

Definition 2.20 (Entailment). A MANCaLog program
$P$ (canonically) entails MANCaLog fact $F$ iff for all (canonical)
models $I$ of $P$, it holds that $I \models F$.

Now we define an ordering over models and define the con- 
cept of minimal model. We then show that if we can find a 
minimal model then we can answer consistency, entailment, 
and tight entailment queries. To do so, we first define a 
pre-order over interpretations. 

Definition 2.21 (Preorder over Interpretations).
Given interpretations $I, I'$ we say $I \sqsubseteq^{pre} I'$ if and
only if for all $t, v, L$ if there exists
$\langle L, bnd \rangle \in I(t)(v)$ then there must
exist $\langle L, bnd' \rangle \in I'(t)(v)$ s.t. $bnd' \subseteq bnd$.

Next, we define an equivalence relation for interpretations
denoted with $\sim$; we will use the notation $[I]$ for the set of
all interpretations equivalent to $I$ w.r.t. $\sim$. This allows us
to define a partial ordering.

Definition 2.22. Two interpretations $I, I'$ are equivalent
(written $I \sim I'$) iff for all $P \in \mathcal{P}$,
$I \models P$ iff $I' \models P$.

Definition 2.23 (Partial Ordering). Given classes
of interpretations $[I], [I']$ that are equivalent w.r.t. $\sim$, we
say that $[I]$ precedes $[I']$, written $[I] \sqsubseteq [I']$, iff
$I \sqsubseteq^{pre} I'$.

The partial ordering is clearly reflexive, antisymmetric,
and transitive. Note that we will often use $I \sqsubseteq I'$ as
shorthand for $[I] \sqsubseteq [I']$. We define two special
interpretations, $\bot$ and $\top$, such that $\forall t \in \tau$,
$c \in \mathcal{G}$, $\bot(t)(c) = \emptyset$ and there exists
network atom $\langle L, \emptyset \rangle \in \top(t)(c)$. Clearly,
no other interpretation can be below $\bot$ as the $[\ell, u]$ bound
on all network atoms for each time step and each component is
$[0,1]$; similarly, no other interpretation is above $\top$, since
for any interpretation $I$ for which there exists
$\langle L, bnd \rangle \in I(t)(c)$ where $bnd \neq \emptyset$,
we have $\emptyset \subseteq bnd$. We can prove (see the full version
of the paper for details) that with $\top$ and $\bot$,
$\langle \mathcal{I}, \sqsubseteq \rangle$ is a complete
lattice. We can now arrive at the notion of minimal model
for a MANCaLog program.

Definition 2.24 (Minimal Model). Given program $P$,
the minimal model of $P$ is a (canonical) interpretation $I$ s.t.
$I \models P$ and for all (canonical) interpretations $I'$ s.t.
$I' \models P$, we have that $I \sqsubseteq I'$.

Suppose we have some algorithm $A$ that takes as input
a program $P$ and returns an interpretation $I$ (where $I$ does
not necessarily satisfy $P$) s.t. for all $I'$ where
$I' \models P$, $I \sqsubseteq I'$.
It is easy to show that if $A(P) \models P$ then $P$ is consistent.
Likewise, if $A(P) = \top$ then $P$ is inconsistent, as all models
must then have a tighter weight bound for the network
atoms than an invalid interpretation (hence, making such an
interpretation invalid as well). Clearly, any such algorithm
$A$ would provide a sound and complete answer to the consistency
problem. Likewise, if we consider the entailment problem,
it is easy to show that for fact
$F = (\langle L, bnd \rangle, c) : [t_1, t_2]$,
$P$ (canonically) entails $F$ iff the minimal model of $P$
(canonically) satisfies $F$. This is because for minimal model
$A(P)$ of $P$, for any time $t \in [t_1, t_2]$, if
$A(P)(t)(c) \models \langle L, bnd \rangle$ then
there is network atom $\langle L, bnd' \rangle \in A(P)(t)(c)$ s.t.
$bnd' \subseteq bnd$. We note that for any other interpretation $I$
of $P$ with $\langle L, bnd'' \rangle \in I(t)(c)$ we have that
$bnd' \supseteq bnd''$. Hence, having a minimal model allows us to
solve any entailment query.
We can think of a minimal model of a MANCaLog program 
as the outcome of a diffusion process in a multi-agent system 
modeled as a complex network. Hence, a question such as 
"how many agents will adopt the product with a weight of 
at least 0.9 in two months?" can be easily answered once 
the minimal model is obtained. 
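
As a rough illustration of such a query, the sketch below counts the components whose bound for a label at a given time lies within [threshold, 1]. The representation of a model as a nested dictionary (time point, then component, then label), the label name "buy", the sample bounds, and the function name count_adopters are all hypothetical choices made for this example.

```python
def count_adopters(model: dict, t: int, label: str, threshold: float) -> int:
    """Count components whose bound for `label` at time t lies in [threshold, 1]."""
    count = 0
    for world in model[t].values():
        # A missing label is read as the uninformative bound [0, 1].
        lo, _hi = world.get(label, (0.0, 1.0))
        if lo >= threshold:
            count += 1
    return count


# Hypothetical minimal model restricted to a single time point t = 2.
model = {
    2: {
        1: {"buy": (0.9, 1.0)},
        2: {"buy": (0.5, 1.0)},
        3: {"buy": (0.95, 1.0)},
        4: {},
    }
}
print(count_adopters(model, 2, "buy", 0.9))  # 2
```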

3. FIXED POINT MODEL COMPUTATION 

In this section we introduce a fixed-point operator that 
produces the non-canonical minimal model of a MANCaLog 
program in polynomial time. This is followed by an algo- 
rithm to find a canonical minimal model also in polynomial 
time. First, we introduce three preliminary definitions. 

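Definition 3.1. For a given MANCaLog program $P$, $c \in \mathcal{G}$, $L \in \mathcal{L}$, and $t \in \tau$ we define function

$$FBnd(P, c, t, L) = \bigcap_{((L, bnd), c) : [t_1, t_2] \in P \text{ s.t. } t \in [t_1, t_2]} bnd$$

Definition 3.2. For a given MANCaLog program $P$, interpretation $I$, $c \in \mathcal{G}$, $L \in \mathcal{L}$, and $t \in \tau$ we define function

$$IBnd(P, I, c, t, L) = \bigcap_{(L, bnd) \hookleftarrow a \in P \text{ s.t. } I(t)(c) \models a} bnd$$

Definition 3.3. Given MANCaLog program $P$, interpretation $I$, $v \in V$, $L \in \mathcal{L}$, and $t \in \tau$, we define

$$RBnd(P, I, v, t, L) = \bigcap_{r \in P \text{ s.t. } t \in TTS(I, v, L, P) \cap TTS(I, v, r)} Bound(r, v, I(t - \Delta t))$$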



We can now introduce the operator. 

Definition 3.4 ($\Gamma$ Operator). For a given MANCaLog
program $P$, we define the operator $\Gamma_P : \mathcal{I} \to \mathcal{I}$ as follows:
For a given $I$, for each $t \in \tau$, $c \in \mathcal{G}$, and $L \in \mathcal{L}$, add $(L, bnd)$
to $\Gamma_P(I)(t)(c)$ where $bnd$ is defined as:

$$bnd = bnd_{prv} \cap FBnd(P, c, t, L) \cap IBnd(P, I, c, t, L) \cap RBnd(P, I, c, t, L)$$

where $(L, bnd_{prv}) \in I(t)(c)$.
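
A single application of the operator can be sketched in Python as below. This is a minimal illustration rather than the authors' implementation: bounds are restricted to closed intervals, an interpretation is flattened into a dictionary keyed by (t, c, L), and the three callables `fbnd`, `ibnd`, and `rbnd` are hypothetical stand-ins for the functions of Definitions 3.1 to 3.3, supplied by the caller.

```python
from typing import Callable, Dict, Optional, Tuple

Interval = Optional[Tuple[float, float]]  # closed [lo, hi]; None stands for the empty bound
Key = Tuple[int, str, str]                # (time point t, component c, label L)
Interp = Dict[Key, Interval]

FULL = (0.0, 1.0)  # the uninformative bound


def meet(a: Interval, b: Interval) -> Interval:
    """Intersection of two closed intervals (None if empty)."""
    if a is None or b is None:
        return None
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


def gamma_step(
    interp: Interp,
    fbnd: Callable[[Key], Interval],
    ibnd: Callable[[Interp, Key], Interval],
    rbnd: Callable[[Interp, Key], Interval],
) -> Interp:
    """One application of the operator: intersect the previous bound with
    the bounds contributed by facts, integrity constraints, and rules."""
    return {
        key: meet(meet(bnd_prv, fbnd(key)), meet(ibnd(interp, key), rbnd(interp, key)))
        for key, bnd_prv in interp.items()
    }


# Toy program (assumed for illustration): a fact pins u at time 0, and one
# rule raises v's lower bound one step after u's lower bound reaches 0.8.
def fbnd(key: Key) -> Interval:
    return (0.8, 1.0) if key == (0, "u", "adopt") else FULL


def ibnd(interp: Interp, key: Key) -> Interval:
    return FULL  # no integrity constraints in the toy program


def rbnd(interp: Interp, key: Key) -> Interval:
    t, c, label = key
    if c == "v" and t >= 1:
        prev = interp.get((t - 1, "u", label))
        if prev is not None and prev[0] >= 0.8:
            return (0.5, 1.0)
    return FULL


bottom: Interp = {(t, c, "adopt"): FULL for t in (0, 1) for c in ("u", "v")}
once = gamma_step(bottom, fbnd, ibnd, rbnd)
twice = gamma_step(once, fbnd, ibnd, rbnd)
```

In the toy run, the first application only records the fact about u; the rule for v fires on the second application, because rule bounds are read from the interpretation passed in.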

It is easy to show that $\Gamma$ can be computed in polynomial
time (the proof is in the full version). Next, we introduce
notation for repeated applications of $\Gamma$.

Definition 3.5 (Iterated Applications of $\Gamma$). Given
natural number $i > 0$, interpretation $I$, and program $P$, we
define $\Gamma_P^i(I)$, the multiple applications of $\Gamma$, as follows:

$$\Gamma_P^i(I) = \begin{cases} \Gamma_P(I) & \text{if } i = 1 \\ \Gamma_P(\Gamma_P^{i-1}(I)) & \text{otherwise} \end{cases}$$

Algorithm 1 CANON_PROC

We can prove that the iterated $\Gamma$ operator converges after a
polynomial number of applications:

Theorem 3.1. Given interpretation $I$ and program $P$,
there exists a natural number $k$ s.t. $\Gamma_P^k(I) = \Gamma_P^{k+1}(I)$, and

$$k \in O\left(|P| \cdot d_*^{in} \cdot t_{max} \cdot |E|\right)$$

where $d_*^{in}$ is the maximum in-degree in the network.

Proof (sketch). For a given vertex $i \in V$, we will use
the notation $d_i^{in}$ to denote the number of incoming neighbors
(of any edge type). First note that for a given $t \in \tau$, $i \in V$,
and $L \in \mathcal{L}$, a given rule $r$ can tighten the bound on a network
atom formed with $L$ no more than $(d_i^{in} + 1) \cdot (d_i^{in} + 1)$ times.
At each application of $\Gamma$, at least one network atom must
tighten. Hence, as there are only $O(|P| \cdot d_*^{in} \cdot t_{max} \cdot |E|)$
tightenings possible, this is also the bound on the number
of applications of $\Gamma$. □
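
The convergence argument can be illustrated with a generic fixed-point loop in Python. The sketch below is an assumption-laden toy, not the operator itself: `iterate_to_fixpoint` accepts any function on comparable states, and `chain_op` is a hypothetical stand-in in which a tightened lower bound travels one hop along a four-node path per application.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def iterate_to_fixpoint(op: Callable[[T], T], start: T, max_steps: int = 100_000) -> Tuple[T, int]:
    """Apply `op` repeatedly until the state stops changing.

    Returns the fixed point and the number of applications that changed the state.
    """
    current = start
    for changes in range(max_steps):
        nxt = op(current)
        if nxt == current:
            return current, changes
        current = nxt
    raise RuntimeError("no fixed point within max_steps applications")


def chain_op(lows: Tuple[float, ...]) -> Tuple[float, ...]:
    """Toy monotone operator on lower bounds along a path 0 -> 1 -> 2 -> ...

    Node 0 is pinned to at least 0.8 (playing the role of a fact); every other
    node inherits the lower bound its predecessor had in the previous state.
    """
    new = [max(lows[0], 0.8)]
    for i in range(1, len(lows)):
        new.append(max(lows[i], lows[i - 1]))
    return tuple(new)


fixed, changes = iterate_to_fixpoint(chain_op, (0.0, 0.0, 0.0, 0.0))
print(fixed, changes)
```

Because every application that is not yet at the fixed point tightens at least one entry, the loop count is bounded by the total number of possible tightenings, which mirrors the counting argument in the proof sketch.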



In the following, we will use the notation $\Gamma_P^*$ to denote the
iterated application of $\Gamma$ after a number of steps sufficient
for convergence; Theorem 3.1 means that we can efficiently
compute $\Gamma_P^*$. We also note that as a single application of $\Gamma$
can be computed in polynomial time, this implies that we
can find a minimal model of a MANCaLog program in polynomial
time. We now prove the correctness of the operator.
We do this first by proving a key lemma that, when combined
with a claim showing that for consistent program $P$,
$\Gamma_P^*$ is a model of $P$, tells us that $\Gamma_P^*$ is a minimal model for $P$.
Following directly from this, we have that $P$ is inconsistent
iff $\Gamma_P^* = \top$.

Lemma 3.2. If $I \models P$ and $I' \sqsubseteq I$ then $\Gamma(I') \sqsubseteq I$.

Theorem 3.3. If program $P$ is consistent then $\Gamma_P^*$ is a
minimal model for $P$.

These results, when taken together, prove that tight entailment
and consistency problems for MANCaLog can be solved
in polynomial time, which is precisely what we set out to accomplish
as part of our desiderata described in Section 1.
Next, we develop an algorithm for the canonical versions
of consistency and tight entailment, and show that we can
bound the running time of the algorithm with a polynomial.
We also note that subsequent runs of the convergence of $\Gamma$
will likely complete quicker in practice, as the initial interpretation
is the last interpretation calculated (cf. line 11).

We also show that the interpretation produced by the algorithm
is a canonical minimal model. Following from that, a
program is inconsistent iff the algorithm returns $\top$.

Require: Program $P$
Ensure: Interpretation $I$

 1: cur_interp = $\Gamma_P^*(\bot)$;
 2: Initialize matrix array cur_free[·][·] where for $v \in V$ and $L \in \mathcal{L}$,
    cur_free[v][L] = $\tau$ − TTS(cur_interp, v, L, P) − {0};
 3: Initialize array vl_pr[·] where for each $t \in [1, t_{max}]$,
    vl_pr[t] = {(v, L) | t ∈ cur_free[v][L]};
 4: for $t = 1, \ldots, t_{max}$ do
 5:   if vl_pr[t] ≠ ∅ then
 6:     for (v, L) ∈ vl_pr[t] do
 7:       Remove $(L, bnd)$ from $I(t)(v)$;
 8:       Let $a$ be the atom in $I(t - 1)(v)$ of the form $(L, bnd')$;
 9:       Add $a$ to $I(t)(v)$;
10:     end for
11:     Set cur_interp = $\Gamma_P^*$(cur_interp);
12:   end if
13:   For $v \in V$ and $L \in \mathcal{L}$, cur_free[v][L] = $\tau$ − TTS(cur_interp, v, L, P) − {0, ..., t}
14:   For each $t' \in [t + 1, t_{max}]$, vl_pr[t'] = {(v, L) | t' ∈ cur_free[v][L]}
15: end for
16: return $I$

Proposition 3.1. Algorithm CANON_PROC performs no
more than $1 + t_{max} \cdot \min(|\mathcal{L}|, |P|) \cdot |V|$ calculations of the
convergence of $\Gamma$.

Theorem 3.4. If $P$ is consistent, then CANON_PROC($P$)
is the minimal canonical model of $P$.

4. APPLICATIONS 

In this section, we will briefly discuss work in progress on 
how MANCaLog can be applied in real world settings. 

It is widely acknowledged that modeling influence in multi- 
agent systems (most usefully modeled as complex networks) 
is highly desirable for many practical problems as varied 
as viral marketing, prevention of drug use, vaccination, and 
power plant failure. Though MANCaLog programs are a rich 
model to work with, the acquisition of rules is the principal 
hurdle to overcome; this is mainly due to this richness of 
representation, since for each rule we must provide a set 
of conditions on the agents being influenced, conditions on 
their neighbors and their ties to their neighbors, and how 
capable these neighbors are of influencing them. A domain 
expert is likely able to provide important insights into these 
components, but the best way to obtain these rules is un- 
doubtedly to leverage the presence of large amounts of data 
in domains like Twitter (with about 340M messages sent per 
day, available through public APIs), Facebook (over 950M 
users with more complex information; not publicly available, 
but data can be requested through apps), and blogging and 
photo hosting sites such as Blogger and Flickr (which have 
millions of users as well).
Figure 2: An architecture for obtaining MANCaLog
programs from available data sources.



Concretely, we have begun working towards this goal by 
extracting several time-series, multi-attribute network data 
sets on which to apply MANCaLog. For instance, to study 
the proliferation of research on different topics, we looked 
at research on "niacin" indexed by Thomson Reuters Web 
of Knowledge (http://wokinfo.com). This topic was cho- 
sen due to its interest to a variety of disciplines, such as 
medicine, biology, and chemistry; this gives the data more 
variety compared with more discipline-specific topics. We 
extracted an author-paper bipartite network consisting of
3,790 papers with 10,465 authors and 16,722 edges (cf. Figure 3);
from this data we can easily focus on various kinds of
networks (co-author, citation, etc.). We have also collected
attribute and time-series data for this network, as well as 
the subjects of the papers; the propagation of these subjects 
is a good starting point to test methods for the acquisition 
of MANCaLog rules. We are harvesting larger datasets from 
various online social networks. Further details can be found 
in the full version of the paper. 
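
As one example of deriving a secondary network from the bipartite data, the Python sketch below projects author-paper edges onto a weighted co-author network. The edge-list input format, the function name `coauthor_edges`, and the sample records are assumptions for illustration only; they do not reflect the actual export format of the dataset.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterable, Tuple


def coauthor_edges(author_paper_edges: Iterable[Tuple[str, str]]) -> Dict[Tuple[str, str], int]:
    """Project (author, paper) edges onto co-author pairs.

    The weight of a pair is the number of papers the two authors share.
    """
    authors_by_paper = defaultdict(set)
    for author, paper in author_paper_edges:
        authors_by_paper[paper].add(author)

    weights: Dict[Tuple[str, str], int] = defaultdict(int)
    for authors in authors_by_paper.values():
        # Sorting gives each undirected pair a single canonical orientation.
        for a, b in combinations(sorted(authors), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Made-up sample: two papers, three authors.
sample = [("ann", "p1"), ("bob", "p1"), ("ann", "p2"), ("bob", "p2"), ("cy", "p2")]
projected = coauthor_edges(sample)
print(projected)
```

Time stamps on the papers could be carried through the projection in the same way, which would give the co-author network the temporal dimension needed for rule acquisition.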

A proposed learning architecture. We are currently 
developing a MANCaLog learning architecture (depicted in
Fig. 2) based on the use of state-of-the-art data analysis,
clustering, and influence learning techniques as building blocks 
for the acquisition of MANCaLog rules from data sets. The 
key question is not just the identification of the best tech- 
niques to adopt, but how to adapt them and combine them 
in such a way as to produce meaningful and useful outputs. 

Consider the diagram in Fig. 2: the data first flows from
raw data sources to the cluster identification component,
which has the goal of identifying sets of agents behaving
as groups (for instance, teens influencing other teens of the
same sex in the consumption of music, or scientists of a certain
field influencing the research topics of others in a related
field) [16] [9]; the main output here is a set of conditions on
nodes and edges that characterize groups of nodes. Once
clusters are identified, the influence recognition component
will make use of both the clusters and the data sources to
recognize what kind of influence is present in the system [1]
[S] [6]; the main output of this component is the influence
function to be used in the MANCaLog rules. The rule generation
component then takes the output of the cluster identification
and influence recognition components, along with
the raw data (e.g., to analyze time stamps) and produces
MANCaLog rules; the output of this component is involved
in a refinement cycle with experts who can provide feedback
on the rules being produced (such as possible combinations
of rules, identification of cases of overfitting, etc.).




Figure 3: (Left) Visualization of a multi-attribute 
time-series author-paper network from 1952 to 2012. 
(Top-Right) Close-up of the data inside the small box 
in the main figure. (Bottom-Right) Close-up showing
node attributes. In all cases, authors are colored 
green and papers are colored red. Data extracted 
from Thomson Reuters Web of Knowledge. 



5. CONCLUSION 

In this paper, we presented the MANCaLog language for 
modeling cascades in multi-agent systems organized in the 
form of complex networks. We started by establishing seven 
criteria in the form of desiderata for such a formalism, and 
proved that MANCaLog meets all of them; to the best of our 
knowledge, this has not been accomplished by any previous 
model in the literature. We also note that MANCaLog is the 
first language of its kind to consider network structure in 
the semantics, potentially opening the door for algorithms 
that leverage features of network topology in more efficiently 
answering queries. Our current work involves implementing
the algorithms described in this paper, as well as the real-world
applications described in Section 4; though our algorithms
have polynomial time complexity, it is likely that
further optimizations will be needed in practice to ensure
scalability for very large data sets.

In the near future, we shall also explore various types 
of queries that have been studied in the literature, such 
as finding agents of maximum influence, identifying agents 
that cause a cascade to spread more quickly, and identifying 
agents that can be influenced in order to halt a cascade. 
8.1 Set of interpretations form a complete lattice

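With top interpretation $\top$ and bottom interpretation $\bot$,
$(\mathcal{I}, \sqsubseteq)$ is a complete lattice.

Proof. Let $\mathcal{I}'$ be a subset of $\mathcal{I}$. We can create $\inf(\mathcal{I}')$
as follows. We build interpretation $I'$. For each $t \in \tau$, $c \in \mathcal{G}$, $L \in \mathcal{L}$,
let $\ell_1$ be the least of the set
$\bigcup_{I \in \mathcal{I}'} \{\ell \mid (L, [\ell, u]) \in I(t)(c) \text{ or } (L, [\ell, u)) \in I(t)(c)\}$
and $\ell_2$ be the least of the set
$\bigcup_{I \in \mathcal{I}'} \{\ell \mid (L, (\ell, u]) \in I(t)(c) \text{ or } (L, (\ell, u)) \in I(t)(c)\}$.
Then, for each $t \in \tau$, $c \in \mathcal{G}$, $L \in \mathcal{L}$, let $u_1$ be the greatest element of the set
$\bigcup_{I \in \mathcal{I}'} \{u \mid (L, [\ell, u]) \in I(t)(c) \text{ or } (L, (\ell, u]) \in I(t)(c)\}$
and $u_2$ be the greatest of the set
$\bigcup_{I \in \mathcal{I}'} \{u \mid (L, [\ell, u)) \in I(t)(c) \text{ or } (L, (\ell, u)) \in I(t)(c)\}$.
If there is any interpretation $I$ in $\mathcal{I}'$ where there is not some $bnd$ s.t.
$(L, bnd) \in I(t)(c)$ then add $(L, [0, 1])$ to $I'(t)(c)$. Otherwise:

- If $\ell_2 < \ell_1$ and $u_1 > u_2$ then add $(L, (\ell_2, u_1])$ to $I'(t)(c)$.
- If $\ell_2 < \ell_1$ and $u_2 > u_1$ then add $(L, (\ell_2, u_2))$ to $I'(t)(c)$.
- If $\ell_2 > \ell_1$ and $u_2 > u_1$ then add $(L, [\ell_1, u_2))$ to $I'(t)(c)$.
- Finally, if $\ell_2 > \ell_1$ and $u_1 > u_2$ then add $(L, [\ell_1, u_1])$ to $I'(t)(c)$.

Clearly, $I' = \inf(\mathcal{I}')$.

In the next part of the proof, we show we can create
$\sup(\mathcal{I}')$ as follows. We build interpretation $I'$. For each
$t \in \tau$, $c \in \mathcal{G}$, $L \in \mathcal{L}$, let $\ell_1$ be the greatest of the set
$\bigcup_{I \in \mathcal{I}'} \{\ell \mid (L, [\ell, u]) \in I(t)(c) \text{ or } (L, [\ell, u)) \in I(t)(c)\}$
and $\ell_2$ be the greatest of the set
$\bigcup_{I \in \mathcal{I}'} \{\ell \mid (L, (\ell, u]) \in I(t)(c) \text{ or } (L, (\ell, u)) \in I(t)(c)\}$.
Then, for each $t \in \tau$, $c \in \mathcal{G}$, $L \in \mathcal{L}$, let $u_1$ be the least element of the set
$\bigcup_{I \in \mathcal{I}'} \{u \mid (L, [\ell, u]) \in I(t)(c) \text{ or } (L, (\ell, u]) \in I(t)(c)\}$
and $u_2$ be the least of the set
$\bigcup_{I \in \mathcal{I}'} \{u \mid (L, [\ell, u)) \in I(t)(c) \text{ or } (L, (\ell, u)) \in I(t)(c)\}$.

- If $\max(\ell_1, \ell_2) > \min(u_1, u_2)$ or $(\ell_2 > \ell_1) \land (u_2 < u_1) \land (\ell_2 = u_2)$ then add $(L, \emptyset)$ to $I'(t)(c)$.
- If $\ell_2 > \ell_1$ and $u_1 < u_2$ then add $(L, (\ell_2, u_1])$ to $I'(t)(c)$.
- If $\ell_2 > \ell_1$ and $u_2 < u_1$ then add $(L, (\ell_2, u_2))$ to $I'(t)(c)$.
- If $\ell_2 < \ell_1$ and $u_2 < u_1$ then add $(L, [\ell_1, u_2))$ to $I'(t)(c)$.
- Finally, if $\ell_2 < \ell_1$ and $u_1 < u_2$ then add $(L, [\ell_1, u_1])$ to $I'(t)(c)$.

Clearly, $I' = \sup(\mathcal{I}')$.

As both $\inf(\mathcal{I}')$ and $\sup(\mathcal{I}')$ exist and are clearly in $\mathcal{I}$ then
the statement follows. □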
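
For a single network atom, the two constructions amount to intersecting the weight bounds (for the supremum) and taking their interval hull (for the infimum), with care for open and closed endpoints. The Python sketch below shows the pairwise versions; folding them over a family of interpretations gives the per-atom results. The `Bound` class and the use of `None` for the empty bound are assumptions for this illustration, and ties between equal endpoints, which the case analysis above does not spell out, are resolved in the natural way.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Bound:
    """An interval inside [0, 1] whose endpoints may be open or closed."""
    lo: float
    hi: float
    lo_closed: bool = True
    hi_closed: bool = True


def intersect(a: Optional[Bound], b: Optional[Bound]) -> Optional[Bound]:
    """Tightest common bound (used for the supremum); None means empty."""
    if a is None or b is None:
        return None
    if a.lo == b.lo:
        lo, lo_closed = a.lo, a.lo_closed and b.lo_closed
    else:
        src = a if a.lo > b.lo else b        # the larger lower endpoint wins
        lo, lo_closed = src.lo, src.lo_closed
    if a.hi == b.hi:
        hi, hi_closed = a.hi, a.hi_closed and b.hi_closed
    else:
        src = a if a.hi < b.hi else b        # the smaller upper endpoint wins
        hi, hi_closed = src.hi, src.hi_closed
    if lo > hi or (lo == hi and not (lo_closed and hi_closed)):
        return None
    return Bound(lo, hi, lo_closed, hi_closed)


def hull(a: Optional[Bound], b: Optional[Bound]) -> Optional[Bound]:
    """Smallest interval covering both bounds (used for the infimum)."""
    if a is None:
        return b
    if b is None:
        return a
    if a.lo == b.lo:
        lo, lo_closed = a.lo, a.lo_closed or b.lo_closed
    else:
        src = a if a.lo < b.lo else b        # the smaller lower endpoint wins
        lo, lo_closed = src.lo, src.lo_closed
    if a.hi == b.hi:
        hi, hi_closed = a.hi, a.hi_closed or b.hi_closed
    else:
        src = a if a.hi > b.hi else b        # the larger upper endpoint wins
        hi, hi_closed = src.hi, src.hi_closed
    return Bound(lo, hi, lo_closed, hi_closed)


x = Bound(0.2, 0.8)                    # [0.2, 0.8]
y = Bound(0.5, 1.0, lo_closed=False)   # (0.5, 1.0]
print(intersect(x, y), hull(x, y))
```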



8.2 A single application of $\Gamma$ can be computed
in polynomial time

For interpretation $I$, $\Gamma(I)$ can be computed by conducting
$O(|P| \cdot |V| \cdot t_{max} \cdot d_*^{in})$ satisfaction checks where $d_*^{in}$ is the
maximum in-degree of a node in the network. (This combined
with the assumption that the influence function is computed
in constant time results in polynomial time computation for
a single application of $\Gamma$.)

Proof. We note that a given rule will require the most
satisfaction checks, as a rule will potentially affect a network
atom of a certain label for each vertex-time point pair. By
the definition of RBnd, a given rule clearly causes no more
than $O(d_*^{in})$ satisfaction checks for each such pair. As the
number of rules is no more than $|P|$, the statement follows. □



8.3 Proof of Theorem 3.1

Given interpretation $I$ and program $P$, there exists a natural
number $k$ s.t. $\Gamma_P^k(I) = \Gamma_P^{k+1}(I)$, and

$$k \in O\left(|P| \cdot d_*^{in} \cdot t_{max} \cdot |E|\right)$$

where $d_*^{in}$ is the maximum in-degree in the network.

Proof. For a given vertex $i \in V$, we will use the notation
$d_i^{in}$ to denote the number of incoming neighbors (of any
edge type) and $d_*^{in} = \max_i d_i^{in}$. First we show that for a
given $t \in \tau$, $i \in V$, and $L \in \mathcal{L}$, a given rule $r$ can tighten
the bound on a network atom formed with $L$ no more than
$(d_i^{in} + 1) \cdot (d_i^{in} + 1)$ times. This is because a given rule
adjusts the bound on a network atom based on the number
of eligible and qualifying neighbors, which are bounded by
$d_i^{in}$, $d_i^{in}$ respectively. At each application of $\Gamma$, at least one
network atom must tighten. Hence, as there are only

$$O\left(|P| \cdot d_*^{in} \cdot t_{max} \cdot \sum_{i \in V} d_i^{in}\right) = O\left(|P| \cdot d_*^{in} \cdot t_{max} \cdot |E|\right)$$

tightenings possible, this is also the bound on the number of applications
of $\Gamma$. □



8.4 Proof of Lemma 3.2

If $I \models P$ and $I' \sqsubseteq I$ then $\Gamma(I') \sqsubseteq I$.

Proof. Suppose, BWOC, that $\Gamma(I') \sqsupset I$. Then, there
exists some $L \in \mathcal{L}$, $t \in \tau$ and $c \in \mathcal{G}$ s.t. $(L, bnd) \in I(t)(c)$,
$(L, bnd') \in I'(t)(c)$, and $(L, bnd'') \in \Gamma(I')(t)(c)$ s.t. $bnd \supset bnd''$
and $bnd' \supset bnd''$. There are four things that affect
$bnd''$: facts, rules, integrity constraints and $bnd'$. Clearly,
we need not consider the effect that either facts or $bnd'$
have on $bnd''$, as $I$ satisfies all facts and $I' \sqsubseteq I$. We also
note that a given integrity constraint imposed by Definition 3.2
can tighten $bnd''$ no more than the associated bound
in any model. Hence, there must be some rule
$r = L \xleftarrow{\Delta t} f, (g_{edge}, g_{node}, h)_{ifl}$ that causes $bnd''$ to become less than
$bnd$. As $bnd'' \neq bnd'$, we know that $t \in TTS(\Gamma(I'), c, r) \cap TTS(I', c, r)$.
As a result, we have $\Gamma(I')(t - \Delta t)(c) \models f$ and
$I'(t - \Delta t)(c) \models f$. Further, as $I \models P$, $I' \sqsubseteq I$, and no rule
can modify a non-fluent atom, we have

$$|Elig(v, g_{edge}, g_{node}, I(t - \Delta t))| = |Elig(v, g_{edge}, g_{node}, I'(t - \Delta t))| = |Elig(v, g_{edge}, g_{node}, \Gamma(I')(t - \Delta t))|.$$

Further, we know that as $I' \sqsubseteq I$, it must be the case that

$$|Qual(v, g_{edge}, g_{node}, h, I(t - \Delta t))| \geq |Qual(v, g_{edge}, g_{node}, h, I'(t - \Delta t))|.$$

This implies, by Axiom 2, that $Bound(r, v, I(t - \Delta t)) \subseteq Bound(r, v, I'(t - \Delta t))$.
This then implies that $bnd \subseteq bnd''$, which is a contradiction. □

8.5 Proof of Theorem 3.3

If program $P$ is consistent then $\Gamma_P^*$ is a minimal model for $P$.

Proof. Claim: If program $P$ is consistent then $\Gamma_P^*$ is a
model of $P$.

Suppose, BWOC, that there is a fact in $P$ that $\Gamma_P^*$ does not
satisfy. However, by the definition of $\Gamma$ and the definition of
a fact, $\Gamma_P^*$ must satisfy all facts as the bound on the weight
associated with each fact is included in the intersection. Further,
we can also see by the definition of $\Gamma$ that $\Gamma_P^*$ strictly
satisfies all non-fluent facts in $P$. We also note that the final
application of the $\Gamma$ operator ensures that all integrity
constraints are satisfied by $\Gamma_P^*$. Now suppose, BWOC, that
there is a rule in $P$ that $\Gamma_P^*$ does not satisfy. However, with
each application of $\Gamma$, for each rule, we include the bound
on the weight returned by the Bound function for each time
step in the target time set associated with that rule. As $\Gamma$
is applied to convergence, and new bounds are intersected
with each application, we know that all time points in
any associated target time set are considered in the intersection.

Proof of Theorem: The above claim tells us that $\Gamma_P^* \models P$.
Now consider interpretation $I$ s.t. $I \models P$. As $\bot \sqsubseteq I$, multiple
applications of Lemma 3.2 tell us that $\Gamma_P^* \sqsubseteq I$. Hence,
the statement follows. □

8.6 Proof of Theorem 3.4

If $P$ is consistent, then CANON_PROC($P$) is the minimal
canonical model of $P$.

Proof. CLAIM 1: If $P$ is consistent, then CANON_PROC($P$)
is a canonical model of $P$.

Clearly, $I$ = CANON_PROC($P$) satisfies all facts and integrity
constraints in $P$. Hence, we shall consider programs
that only consist of rules in this proof. We say $I$ $L$-canonically
satisfies $P$ iff $I$ canonically satisfies $\{r \in P \mid head(r) = L\}$.
Clearly, $I$ canonically satisfies $P$ if for all $L \in \mathcal{L}$, $P$ is
$L$-canonically satisfied by $I$. We say that $I$ is an $(L, c, q)$-canonically
consistent interpretation if for $c \in \mathcal{G}$, for each of the
first $q$ elements $t \in \tau - TTS(I, c, L, P) - \{0\}$, $I(t)(c) \models (L, bnd)$ where
$(L, bnd) \in I(t - 1)(c)$. Consider some $L \in \mathcal{L}$ and $c \in \mathcal{G}$.
Clearly, $I$ is an $(L, c, 0)$-model for $P$. Let us assume, for
some value $q$, that $I$ is an $(L, c, q - 1)$-model for $P$. Let time
point $t$ be the $q$-th element of $\tau - TTS(I, c, L, P) - \{0\}$.
Consider the time step before time $t$ is considered in the for-loop
at line 4 of CANON_PROC, which causes the condition
at line 5 to be true. By line 13, $\tau - TTS(I, c, L, P) - \{0\} \subseteq$
cur_free[c][L]. This means that $t$ is a member of both.
Hence, when $t$ is considered at line 4, the condition at line 5
is true, causing $(L, bnd) \in I(t)(c) \cap I(t - 1)(c)$, and as the
element $(L, bnd) \in I(t - 1)(c)$ is not changed here, we have
shown that $I$ is an $(L, c, q)$-model for $P$. By the for-loop at
line 6, for all $L \in \mathcal{L}$ and $c \in \mathcal{G}$, $I$ is an $(L, c, q)$-model for $P$.
Hence, at the for-loop at line 4, we can be assured that for
$L \in \mathcal{L}$ and $c \in \mathcal{G}$, $I$ $(L, c, |\tau - TTS(I, c, L, P) - \{0\}|)$-satisfies
$P$, which means that $I$ canonically satisfies $P$.

CLAIM 2: If $I$ is a canonical model for $P$,
cur_interp $\sqsubseteq I$ is an interpretation that also strictly satisfies
all non-fluent facts in $P$, and cur_interp' is cur_interp after
being manipulated in lines 6-10 of CANON_PROC, then
cur_interp' $\sqsubseteq I$.

We note that by the definition of satisfaction of a non-fluent
fact, and the fact that both cur_interp and $I$ must
strictly satisfy all non-fluent facts in $P$, we know that for all
$c \in \mathcal{G}$ and $L \in \mathcal{L}$ that:

$$TTS(I, c, L, P) = TTS(cur\_interp, c, L, P) = TTS(cur\_interp', c, L, P)$$

Let us assume that lines 6-10 of the algorithm are changing
cur_interp when the outer loop is considering time $t$ and
that the condition at line 5 is true. Clearly,

$$t \in \tau - TTS(I, v', L', P) - \{0\}$$

As a result, for any $(v, L)$ pair considered at this point
by the algorithm, if $(L, bnd) \in I(t)(v)$ and $(L, bnd') \in I(t - 1)(v)$
then we have $bnd = bnd'$. By the algorithm,
if we have $(L, bnd^*) \in$ cur_interp'$(t)(v)$ and $(L, bnd^{**}) \in$
cur_interp'$(t - 1)(v)$ we have that $bnd^* = bnd^{**}$. As
$(L, bnd^{**}) \in$ cur_interp$(t - 1)(v)$, we know that $bnd' \subseteq bnd^{**}$.
As a result, we have cur_interp' $\sqsubseteq I$, completing the
claim.

Proof of theorem: As initially cur_interp = $\Gamma_P^*$ and $\Gamma_P^* \sqsubseteq I$
by Theorem 3.3, we note that the algorithm changes
cur_interp either by applying $\Gamma$ or manipulating it in lines 6-10,
which tells us (by Claim 2) that for all models $I$ of $P$,
CANON_PROC($P$) $\sqsubseteq I$. Since by Claim 1 we know that
CANON_PROC($P$) $\models P$, the statement of the theorem follows. □

8.7 Details on the Extracted Dataset 

One way in which MANCaLog can be used is looking at 
proliferation of research on different topics. We look at 
research conducted on niacin, an organic compound com- 
monly used for increasing levels of high density lipopro- 
teins (HDL). Using Thomson Reuters Web of Knowledge 
(http://wokinfo.com) we were able to extract information 
on 4,202 articles about niacin. This information was then
processed using the Science of Science (Sci2) Tool
(http://sci2.cns.iu.edu) to extract numerous different networks
such as author by paper networks, citation networks, and paper
by subject networks. Each paper has attributes about
when it was published, what journal it was published in, and
what subjects the paper was about. During the first time
period there is a total of 508 papers with 856 different authors
and 1,231 connections based on an author being cited
as an author of a given paper. During the second time period,
there is a total of 3,790 papers with 10,465 different
authors and 16,772 connections.



